Abstract

In this paper the minimum spanning tree problem with uncertain edge costs is discussed.
In order to model the uncertainty a discrete scenario set is specified and a robust
framework is adopted to choose a solution. The min-max, min-max regret and 2-stage
min-max versions of the problem are discussed. The complexity and approximability of all
these problems are explored. It is proved that the min-max and min-max regret versions
with nonnegative edge costs are hard to approximate within $O(\log^{1-\epsilon} n)$ for any
$\epsilon > 0$ unless the problems in NP have quasi-polynomial time algorithms. Similarly, the
2-stage min-max problem cannot be approximated within $O(\log n)$ unless the problems in NP
have quasi-polynomial time algorithms. In this paper randomized LP-based approximation
algorithms with performance ratio of $O(\log^2 n)$ for min-max and 2-stage min-max
problems are also proposed.

1 Introduction

The usual assumption in combinatorial optimization is that all input parameters are precisely 
known. However, in real life this is rarely the case. There are two popular optimization 
settings of problems for hedging against uncertainty of parameters: stochastic optimization 
setting and robust optimization setting. 

In the stochastic optimization, the uncertainty is modeled by specifying probability dis- 
tributions of the parameters and the goal is to optimize the expected value of a solution built 
(see, e.g., 13 [22]). One of the most popular models of the stochastic optimization is a 2-stage 
model [7]. In the 2-stage approach the precise values of the parameters are specified in the 
first stage, while the values of these parameters in the second stage are uncertain and are 
specified by probability distributions. The goal is to choose a part of a solution in the first 
stage and complete it in the second stage so that the expected value of the obtained solu- 
tion is optimized. Recently, there has been a growing interest in combinatorial optimization 
problems formulated in the 2-stage stochastic framework [9], [10], [12], [15], [21].

In the robust optimization setting [17] the uncertainty is modeled by specifying a set of
all possible realizations of the parameters called scenarios. No probability distribution in
the scenario set is given. In the discrete scenario case, which is considered in this paper, we
define a scenario set by explicitly listing all scenarios. Then, in order to choose a solution, two
optimization criteria, called the min-max and the min-max regret, can be adopted. Under the
min-max criterion, we seek a solution that minimizes the largest cost over all scenarios. Under
the min-max regret criterion we wish to find a solution which minimizes the largest deviation
from optimum over all scenarios. A deeper discussion on both criteria can be found in [17].
The min-max (regret) versions of some basic combinatorial optimization problems with discrete
structure of uncertainty have been extensively studied in the recent literature [2], [3], [14], [19].
Furthermore, both robust criteria can be easily extended to the 2-stage framework. Such an
extension has been recently done in [8], [16].

In this paper, we wish to investigate the min-max (regret) and min-max 2-stage versions of
the classical minimum spanning tree problem. The classical deterministic problem is formally
stated as follows. We are given a connected graph $G = (V, E)$ with edge costs $c_e$, $e \in E$. We
seek a spanning tree of $G$ of the minimal total cost. We use $\Phi$ to denote the set of all spanning
trees of $G$. The classical deterministic minimum spanning tree is a well studied problem, for
which several very efficient algorithms exist (see, e.g., [1]).

In the robust framework, the edge costs are uncertain and the set of scenarios $\Gamma$ is defined
by explicitly listing all possible edge cost vectors. So, $\Gamma = \{S_1, \dots, S_K\}$ is finite and contains
exactly $K$ scenarios, where a scenario is a cost realization $S = (c_e^S)_{e \in E}$. In this paper we
consider the unbounded case, where the number of scenarios is a part of the input. We will
denote by $C^*(S) = \min_{T \in \Phi} \sum_{e \in T} c_e^S$ the cost of a minimum spanning tree under a fixed
scenario $S \in \Gamma$. In the Min-max Spanning Tree problem, we seek a spanning tree that
minimizes the largest cost over all scenarios, that is

$$OPT_1 = \min_{T \in \Phi} \max_{S \in \Gamma} \sum_{e \in T} c_e^S. \qquad (1)$$

In the Min-max Regret Spanning Tree, we wish to find a spanning tree that minimizes
the maximal regret:

$$OPT_2 = \min_{T \in \Phi} \max_{S \in \Gamma} \Big\{ \sum_{e \in T} c_e^S - C^*(S) \Big\}. \qquad (2)$$

The formulation (1) is a single-stage decision one. We can extend this formulation to a
2-stage one as follows. We are given the first stage edge costs $c_e$, $e \in E$, and in the second
stage there are $K$ possible cost realizations (scenarios) listed in scenario set $\Gamma$. The 2-stage
Spanning Tree problem consists in determining a subset of edges $E_1$ in the first stage and
a subset of edges $E_2^S$ that augments it to form a spanning tree $T^S = E_1 \cup E_2^S \in \Phi$ under
scenario $S$ in the second stage for each scenario $S \in \Gamma$. The goal is to minimize the maximum
cost of the determined subsets of edges $E_1, E_2^{S_1}, \dots, E_2^{S_K}$:

$$OPT_3 = \min_{E_1, E_2^{S_1}, \dots, E_2^{S_K}} \max_{S \in \Gamma} \Big\{ \sum_{e \in E_1} c_e + \sum_{e \in E_2^S} c_e^S \; : \; T^S = E_1 \cup E_2^S \in \Phi \Big\}. \qquad (3)$$
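
The single-stage definitions can be checked on toy instances by exhaustive enumeration. The following is only a brute-force sketch, exponential in the graph size and not one of the algorithms of this paper; the function names `is_spanning_tree` and `robust_values` and the triangle instance are illustrative assumptions.

```python
from itertools import combinations

def is_spanning_tree(n, tree_edges):
    """True if the given n-1 edges are acyclic, hence span all n vertices."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return len(tree_edges) == n - 1

def robust_values(n, edges, scenarios):
    """Brute-force OPT_1 (min-max) and OPT_2 (min-max regret).

    edges: list of (u, v); scenarios: list of cost lists aligned with edges.
    """
    trees = [t for t in combinations(range(len(edges)), n - 1)
             if is_spanning_tree(n, [edges[i] for i in t])]

    def cost(tree, scenario):
        return sum(scenario[i] for i in tree)

    best = [min(cost(t, s) for t in trees) for s in scenarios]  # C*(S)
    opt1 = min(max(cost(t, s) for s in scenarios) for t in trees)
    opt2 = min(max(cost(t, s) - b for s, b in zip(scenarios, best))
               for t in trees)
    return opt1, opt2

# Toy instance (assumed for illustration): a triangle with two scenarios.
triangle = [(0, 1), (1, 2), (0, 2)]
print(robust_values(3, triangle, [[1, 5, 2], [4, 1, 3]]))  # (6, 3)
```

On this instance the tree minimizing the largest cost and the tree minimizing the largest regret need not coincide in general, which is why the two criteria are treated separately.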

Let us now recall some known results on the problems under consideration. In the bounded
case (when the number of scenarios is bounded by a constant), the Min-max (Regret)
Spanning Tree problem is NP-hard even if $\Gamma$ contains only 2 scenarios [17] and admits an
FPTAS [3], whose running time, however, grows exponentially with $K$. In the unbounded
case, the Min-max (Regret) Spanning Tree problem is strongly NP-hard [2], [17] and
not approximable within $(2 - \epsilon)$, for any $\epsilon > 0$, unless P=NP even for edge series-parallel
graphs [14]. The Min-max (Regret) Spanning Tree problem is approximable within
$K$ [3]. However, up to now the existence of an approximation algorithm with a constant
performance ratio for the unbounded case has been an open question. To the best of the
authors' knowledge the 2-stage version of the minimum spanning tree problem seems to exist
only in the stochastic setting [9], [10], [12]. Recently, the robust 2-stage framework has been
employed in [8], [16] for some network design and matching problems.

Our results. In this paper we prove that the Min-max Spanning Tree and Min-max
Regret Spanning Tree problems are hard to approximate with a constant performance
ratio (Theorem 3 and Corollary 1). Namely, they are not approximable within $O(\log^{1-\epsilon} n)$
for any $\epsilon > 0$, where $n$ is the input size, unless NP $\subseteq$ DTIME($n^{\mathrm{poly} \log n}$). We thus give a
negative answer to the open question about the existence of approximation algorithms with a
constant performance ratio for these problems. Moreover, if both positive and negative edge
costs are allowed, then the Min-max Spanning Tree problem is not at all approximable
unless P=NP (Theorem 4). For the 2-stage Spanning Tree problem, we show that it is
not approximable within any constant, unless P=NP, and within $(1 - \epsilon) \ln n$ for any $\epsilon > 0$,
unless NP $\subseteq$ DTIME($n^{\log \log n}$) (Theorem 6). The above negative results encourage us to find
randomized approximation algorithms, which yield an $O(\log^2 n)$ approximation ratio for
Min-max Spanning Tree (Theorem 5) and 2-Stage min-max Spanning Tree (Theorem 7).

2 Min-max (regret) spanning tree 

In this section, we study the Min-max Spanning Tree and Min-max Regret Spanning
Tree problems. We improve the results obtained in ^E], by showing that both problems
are hard to approximate within a ratio of $O(\log^{1-\epsilon} n)$ for any $\epsilon > 0$, unless the problems in NP
have quasi-polynomial time algorithms. We then provide an LP-based randomized algorithm
with approximation ratio of $O(\log^2 n)$ for Min-max Spanning Tree.

2.1 Hardness of approximation 

We reduce a variant of the Label Cover problem (see e.g., [SlIIl]) to Min-max Spanning
Tree.

Label Cover: Input: A regular bipartite graph $G = (V, W, E)$, $E \subseteq V \times W$; an integer $N$
that defines the set of labels, which are integers in $\{1, \dots, N\}$; for every edge $(v, w) \in E$
a partial map $\sigma_{v,w} : \{1, \dots, N\} \to \{1, \dots, N\}$. A labeling of the instance
$\mathcal{L} = (G, N, \{\sigma_{v,w}\}_{(v,w) \in E})$ is a function $l$ assigning a nonempty set of labels to each vertex
in $V \cup W$, namely $l : V \cup W \to 2^{\{1, \dots, N\}}$. A labeling satisfies an edge $(v, w) \in E$ if

$$\exists a \in l(v), \; \exists b \in l(w) : \sigma_{v,w}(a) = b.$$

A total labeling is a labeling that satisfies all edges. The value of a total labeling $l$ is
$\max_{x \in V \cup W} |l(x)|$.

Output: A total labeling of the minimum value. This value is denoted by $val(\mathcal{L})$.
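
To make the definition concrete, the sketch below checks whether a labeling is total and computes its value. The data layout (partial maps as Python dictionaries keyed by edge) and the names `satisfies` and `labeling_value` are assumptions made only for this illustration; the $K_{2,2}$ instance is a made-up toy example.

```python
def satisfies(labeling, edge, sigma):
    """True if some a in l(v) is mapped by sigma_{v,w} to a label in l(w)."""
    v, w = edge
    # dict.get returns None where the partial map is undefined,
    # and None is never a label, so undefined entries cannot satisfy.
    return any(sigma[edge].get(a) in labeling[w] for a in labeling[v])

def labeling_value(labeling, edges, sigma):
    """Value max_x |l(x)| of a total labeling, or None if some edge fails."""
    if not all(satisfies(labeling, e, sigma) for e in edges):
        return None
    return max(len(labels) for labels in labeling.values())

# Toy instance on K_{2,2} with N = 2 (assumed for illustration).
edges = [("v1", "w1"), ("v1", "w2"), ("v2", "w1"), ("v2", "w2")]
sigma = {
    ("v1", "w1"): {1: 1, 2: 2},
    ("v1", "w2"): {1: 2, 2: 1},
    ("v2", "w1"): {1: 2},
    ("v2", "w2"): {1: 1, 2: 2},
}
one_label = {"v1": {2}, "v2": {1}, "w1": {2}, "w2": {1}}
two_labels = {"v1": {1}, "v2": {1}, "w1": {1, 2}, "w2": {1, 2}}
not_total = {"v1": {1}, "v2": {1}, "w1": {1}, "w2": {2}}

print(labeling_value(one_label, edges, sigma))   # 1
print(labeling_value(two_labels, edges, sigma))  # 2
print(labeling_value(not_total, edges, sigma))   # None
```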
We now recall the following theorem [U [19] :

Theorem 1. There exists a constant $\gamma > 0$ so that for any language $L \in$ NP, any input $w$
and $N > 0$, one can construct an instance $\mathcal{L}$ of Label Cover, with $|w|^{O(\log N)}$ vertices and
the label set of size $N$, so that:

$$w \in L \Rightarrow val(\mathcal{L}) = 1,$$
$$w \notin L \Rightarrow val(\mathcal{L}) > N^{\gamma}.$$

Furthermore, $\mathcal{L}$ can be constructed in time polynomial in its size.

We now state and prove the theorem, which is essential in showing the hardness results 
for the problems of interest. 

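Theorem 2. There exists a constant $\gamma > 0$ so that for any language $L \in$ NP, any input $w$,
any $N > 0$ and any $g \le N^{\gamma}$, one can construct an instance $\mathcal{T}$ of Min-max Spanning Tree
in time $O(|w|^{O(g \log N)} N^{O(g)})$, so that:

$$w \in L \Rightarrow OPT_1(\mathcal{T}) \le 1,$$
$$w \notin L \Rightarrow OPT_1(\mathcal{T}) \ge g.$$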
Proof. Let $L$ be a language in NP and let $\mathcal{L} = (G = (V, W, E), N, \{\sigma_{v,w}\}_{(v,w) \in E})$ be the
instance of Label Cover from Theorem 1 constructed for $L$. Let us introduce some additional
notations:

• $\delta(x)$ is the set of edges of $G$ incident to vertex $x \in V \cup W$,

• $N_{v,w} = \{(a, b) \in \{1, \dots, N\} \times \{1, \dots, N\} : \sigma_{v,w}(a) = b\}$.

We now transform $\mathcal{L}$ to an instance $\mathcal{T}$ of Min-max Spanning Tree. Let us fix $g \le N^{\gamma}$,
where $\gamma$ is the constant from Theorem 1. We first construct graph $G'$ in the following way.
We replace every edge $(v, w) \in E$ with paths $(v, u_{v,w}^{a,b}, w)$ for all $(a, b) \in N_{v,w}$ (see Figure 1).
The edges of the form $(u_{v,w}^{a,b}, w)$ (the dashed edges) are called dummy edges and the edges
of the form $(v, u_{v,w}^{a,b})$ (the solid edges) are called label edges. We say that label edge $(v, u_{v,w}^{a,b})$
assigns label $a$ to $v$ and label $b$ to $w$. We will denote the obtained component by $G_{v,w}$ and we
will use $E^l_{v,w}$ to denote the set of all label edges of $G_{v,w}$; obviously $|E^l_{v,w}| = |N_{v,w}|$. We finish
the construction of $G'$ by adding an additional vertex $s$ and connecting all the components by
additional dummy edges $(s, v)$ for all $v \in V$. A sample graph $G'$, where $G$ is $K_{3,3}$, is shown
in Figure 2.

We now form scenario set $\Gamma$. We first note that all dummy edges under all scenarios have
costs equal to 0. We say that two label edges are label-distinct if they do not assign the same
label to any vertex $v$ or $w$. Namely, $(v, u_{v,w}^{a,b})$ and $(v', u_{v',w'}^{a',b'})$ are label-distinct if $a = a'$
implies $v \ne v'$ and $b = b'$ implies $w \ne w'$. Consider vertex $v \in V$, for which there is the
set of $p = |\delta(v)|$ components $\mathcal{G} = \{G_{v,w_1}, \dots, G_{v,w_p}\}$. For every subset $\mathcal{F} \subseteq \mathcal{G}$ of exactly
$g$ components, $\mathcal{F} = \{G_{v,w_1}, \dots, G_{v,w_g}\}$, and for every $g$-tuple of pairwise label-distinct edges
$((v, u_{v,w_1}^{a_1,b_1}), \dots, (v, u_{v,w_g}^{a_g,b_g})) \in E^l_{v,w_1} \times \dots \times E^l_{v,w_g}$ we form a scenario under which all these edges
have cost 1 and all the remaining edges have cost 0. We repeat this procedure for all vertices
$v \in V$. Consider then vertex $w \in W$, for which there is the set of $q = |\delta(w)|$ components
$\mathcal{G} = \{G_{v_1,w}, \dots, G_{v_q,w}\}$. For every subset $\mathcal{F} \subseteq \mathcal{G}$ of exactly $g$ components,
$\mathcal{F} = \{G_{v_1,w}, \dots, G_{v_g,w}\}$, and for every $g$-tuple of pairwise label-distinct edges
$((v_1, u_{v_1,w}^{a_1,b_1}), \dots, (v_g, u_{v_g,w}^{a_g,b_g})) \in E^l_{v_1,w} \times \dots \times E^l_{v_g,w}$ we form a scenario under which all
these edges have cost 1 and all the remaining edges have cost 0. We repeat this for all vertices
$w \in W$. In order to ensure $\Gamma \ne \emptyset$, we include in $\Gamma$ the scenario in which every edge has zero cost.

Figure 2: A sample of graph $G'$, where graph $G$ in $\mathcal{L}$ is $K_{3,3}$.
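
The scenario construction around a single vertex can be sketched as follows. This is an illustrative enumeration under assumed encodings: a label edge $(v, u_{v,w}^{a,b})$ is represented by the tuple `(v, w, a, b)`, and `label_edges` maps each edge of $G$ to the list of pairs in $N_{v,w}$; neither the names nor the toy data come from the paper.

```python
from itertools import combinations, product

def label_distinct(e, f):
    """Label edges are tuples (v, w, a, b): the edge assigns a to v and b to w."""
    (v, w, a, b), (v2, w2, a2, b2) = e, f
    return not (a == a2 and v == v2) and not (b == b2 and w == w2)

def scenarios_at(x, side, label_edges, g):
    """Scenarios generated at vertex x (side 0: x in V, side 1: x in W).

    Each scenario is returned as the frozenset of label edges with cost 1;
    every other edge implicitly has cost 0.
    """
    components = [vw for vw in label_edges if vw[side] == x]
    found = set()
    for chosen in combinations(components, g):
        pools = [[(v, w, a, b) for (a, b) in label_edges[(v, w)]]
                 for (v, w) in chosen]
        for tup in product(*pools):
            if all(label_distinct(e, f) for e, f in combinations(tup, 2)):
                found.add(frozenset(tup))
    return found

# Toy pairs N_{v,w} on K_{2,2} (assumed for illustration), with g = 2.
label_edges = {
    ("v1", "w1"): [(1, 1), (2, 2)],
    ("v1", "w2"): [(1, 2), (2, 1)],
    ("v2", "w1"): [(1, 2)],
    ("v2", "w2"): [(1, 1), (2, 2)],
}
print(len(scenarios_at("v1", 0, label_edges, 2)))  # 2
print(len(scenarios_at("w1", 1, label_edges, 2)))  # 1
```

The number of tuples examined grows as $N^{2g}$ per choice of $g$ components, which matches the $N^{O(g)}$ factor in the instance size of Theorem 2.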

Assume that $w \in L$ and thus $val(\mathcal{L}) = 1$. Thus, there exists a total labeling $l$ satisfying
all edges in $G$ such that $\max_{x \in V \cup W} |l(x)| = 1$. Each edge $(v_i, w_i) \in E$ in $G$ corresponds
to exactly one component $G_{v_i,w_i}$ in $G'$. Let $(a_i, b_i)$ be the pair of labels satisfying the
edge $(v_i, w_i)$ in total labeling $l$, i.e. $a_i \in l(v_i)$ and $b_i \in l(w_i)$. We form a spanning tree $T$ in
$G'$ by adding exactly one edge $(v_i, u_{v_i,w_i}^{a_i,b_i})$ from every component $G_{v_i,w_i}$ and we complete the
construction by adding a necessary number of dummy edges. Since the labeling $l$ is such that
$\max_{x \in V \cup W} |l(x)| = 1$, no pair of label-distinct edges has been chosen while constructing $T$,
so $\sum_{e \in T} c_e^S \le 1$ for all $S \in \Gamma$ and consequently $\max_{S \in \Gamma} \sum_{e \in T} c_e^S \le 1$.

Assume that $w \notin L$ and thus $\max_{x \in V \cup W} |l(x)| > N^{\gamma} \ge g$ for all total labelings $l$.
Consider any spanning tree $T$ in $G'$. Without loss of generality, we can assume that $T$ contains
exactly one label edge from every component $G_{v,w}$. The set of all label edges contained in $T$
corresponds to a total labeling $l$ of $\mathcal{L}$. Since $|l(x)| \ge g$ for some vertex $x \in V \cup W$, we have
to use at least $g$ distinct labels in the labeling $l$. Suppose that $x = v \in V$ and we use distinct
labels $a_1, \dots, a_g$ for $v$. Then, $T$ contains pairwise label-distinct edges $(v, u_{v,w_i}^{a_i,b_i})$, $i = 1, \dots, g$,
and $\sum_{e \in T} c_e^S = g$ under the scenario $S$ that corresponds to this $g$-tuple of edges. The reasoning
for $x = w$, $w \in W$, is the same. In consequence $\max_{S \in \Gamma} \sum_{e \in T} c_e^S \ge g$ and $OPT_1(\mathcal{T}) \ge g$.

Let us now examine the size of the resulting instance of the Min-max Spanning Tree
problem. The size of the set of edges $E'$ is at most $|V| + 2|E|N^2$, the size of the set of vertices
$V'$ is at most $1 + |V| + |E|N^2 + |W|$ and the number of scenarios is at most $1 + 2|E|^g N^{2g}$.
Hence, and from $|E| = |w|^{O(\log N)}$, we deduce that the size of the constructed instance $(G', \Gamma)$
is $|w|^{O(g \log N)} N^{O(g)}$, so it can be constructed in $O(|w|^{O(g \log N)} N^{O(g)})$ time. □

From Theorem 2 we obtain the following result:

Theorem 3. The Min-max Spanning Tree problem with nonnegative edge costs under all
scenarios is not approximable within $O(\log^{1-\epsilon} n)$ for any $\epsilon > 0$, where $n$ is the input size,
unless NP $\subseteq$ DTIME($n^{\mathrm{poly} \log n}$).

Proof. Let $\gamma$ be the constant from Theorem 2. For any $\beta > 0$ we fix $g = \log^{\beta} |w|$ and
$N = \log^{\beta/\gamma} |w|$, so that inequality $g \le N^{\gamma}$ is satisfied for the constant $\gamma$ (see Theorem 2).
The input size of the resulting instance $(G', \Gamma)$ from Theorem 2 is $n = |w|^{O(g \log N)} N^{O(g)} =
|w|^{O(\log^{\beta + \delta} |w|)}$ for some constant $\delta > 0$, so it can be constructed in $O(|w|^{\mathrm{poly} \log |w|})$ time. Since
$g = \log^{\beta} |w|$ and $n = 2^{O(\log^{\beta + \delta + 1} |w|)}$, we get $g = O(\log^{\beta/(\beta + \delta + 1)} n)$ and the gap is
$O(\log^{1-\epsilon} n)$ for any $\epsilon > 0$.

□

Corollary 1. The Min-max Regret Spanning Tree problem is not approximable within
$O(\log^{1-\epsilon} n)$ for any $\epsilon > 0$, where $n$ is the input size, unless NP $\subseteq$ DTIME($n^{\mathrm{poly} \log n}$).

Proof. The corollary follows easily if we assume that each component $G_{v,w}$ in the construction
from Theorem 2 has at least 2 label edges or, equivalently, every edge in the instance of Label
Cover has at least two pairs of labels. In this case, under every scenario $S \in \Gamma$, there is
a spanning tree of cost 0 (recall that we never assign two 1's to the same component in $S$).
Hence $OPT_1(\mathcal{T}) = OPT_2(\mathcal{T})$ and the proof is completed. If some edge in the instance of
Label Cover has only one pair of labels, then this pair trivially forces an assignment of
labels to two vertices, which (after checking consistency with other edges) can be removed
from the instance before applying the construction from Theorem 2.

□ 

Up to this point we have assumed that the edge costs under all scenarios are nonnegative. 
The following theorem demonstrates that violation of this assumption makes the Min-max 
Spanning Tree problem not at all approximable: 

Theorem 4. If both positive and negative costs are allowed, then the Min-max Spanning
Tree problem is not at all approximable unless P=NP even for edge series-parallel graphs.

Proof. We show a gap-introducing reduction from 3-SAT which is known to be strongly
NP-complete [13].

3-SAT: Input: A set $U = \{x_1, \dots, x_n\}$ of Boolean variables and a collection $\mathcal{C} = \{C_1, \dots, C_m\}$
of clauses, where every clause in $\mathcal{C}$ has exactly three distinct literals.

Question: Is there an assignment to $U$ that satisfies all clauses in $\mathcal{C}$?

We will assume that in the instance of 3-SAT for every variable $x_i$ both $x_i$ and $\sim x_i$ appear
in $\mathcal{C}$. Obviously, under such an assumption 3-SAT remains strongly NP-complete. Given an
instance of 3-SAT we construct an instance of Min-max Spanning Tree as follows. For
each clause $C_i = (l_1^i \vee l_2^i \vee l_3^i)$ we create a graph $G_i$ composed of 5 vertices: $s_i, v_1^i, v_2^i, v_3^i, t_i$
and 6 edges: the edges $(s_i, v_1^i)$, $(s_i, v_2^i)$, $(s_i, v_3^i)$ correspond to literals in $C_i$, the edges $(v_1^i, t_i)$,
$(v_2^i, t_i)$, $(v_3^i, t_i)$ have costs equal to $-1$ under every scenario. In order to construct a connected
graph $G = (V, E)$ with $|V| = 4m + 1$, $|E| = 6m$, we identify vertex $t_i$ of $G_i$ with vertex $s_{i+1}$
of $G_{i+1}$ for $i = 1, \dots, m - 1$. Note that the resulting graph $G$ is edge series-parallel. Finally,
we form scenario set $\Gamma$ as follows. For every pair of edges of $G$, $(s_i, v_j^i)$ and $(s_q, v_r^q)$, that
correspond to contradictory literals $l_j^i$ and $l_r^q$, i.e. $l_j^i = \sim l_r^q$, we create scenario $S$ such that
under this scenario the costs of the edges $(s_i, v_j^i)$ and $(s_q, v_r^q)$ are set to $4m - 1$ and the costs
of all the remaining edges are set to $-1$. It is easy to verify that each spanning tree $T$ in the
constructed instance has nonnegative maximal cost over all scenarios.

Suppose that 3-SAT is satisfiable. Then there exists a spanning tree $T$ of $G$ containing
exactly $4m$ edges that do not correspond to contradictory literals. Thus, under every
scenario $S$, the tree contains at most one edge with the cost $4m - 1$ and all its other
edges have costs equal to $-1$. In consequence we get $\sum_{e \in T} c_e^S \le 0$ under every $S \in \Gamma$ and
$OPT_1 = 0$. If 3-SAT is unsatisfiable, then every spanning tree $T$ of $G$ contains at least
two edges which correspond to contradictory literals, and so $OPT_1 = \max_{S \in \Gamma} \sum_{e \in T} c_e^S = 4m$.
Consequently Min-max Spanning Tree is not approximable, unless P=NP. Otherwise, any
polynomial time approximation algorithm applied to the constructed instance could decide if
an instance of 3-SAT is satisfiable. □
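
The reduction can be exercised on a very small formula. The sketch below builds the graph and the scenario set from clauses given as triples of nonzero integers (a negative integer denotes a negated variable, an encoding assumed here for convenience) and evaluates $OPT_1$ by brute force. The vertex numbering and the function names `build_instance` and `min_max_value` are illustrative choices, and the enumeration is only feasible for a handful of clauses.

```python
from itertools import combinations

def build_instance(clauses):
    """Return (number of vertices, edge list, scenario cost vectors)."""
    m = len(clauses)
    edges, literal = [], {}
    for i, clause in enumerate(clauses):
        s, t = 4 * i, 4 * (i + 1)          # t_i is identified with s_{i+1}
        for j, lit in enumerate(clause):
            v = 4 * i + 1 + j
            literal[len(edges)] = lit       # literal edge (s_i, v_j^i)
            edges.append((s, v))
            edges.append((v, t))            # edge (v_j^i, t_i)
    scenarios = []
    for p, q in combinations(sorted(literal), 2):
        if literal[p] == -literal[q]:       # contradictory pair of literals
            cost = [-1] * len(edges)
            cost[p] = cost[q] = 4 * m - 1
            scenarios.append(cost)
    return 4 * m + 1, edges, scenarios

def min_max_value(n, edges, scenarios):
    """OPT_1 by enumerating all acyclic edge subsets of size n - 1."""
    best = None
    for tree in combinations(range(len(edges)), n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for i in tree:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            worst = max(sum(s[i] for i in tree) for s in scenarios)
            best = worst if best is None else min(best, worst)
    return best

# Satisfiable toy formula: (x1 v x2 v x3) and (~x1 v ~x2 v ~x3).
n, edges, scenarios = build_instance([(1, 2, 3), (-1, -2, -3)])
print(n, len(edges), len(scenarios))        # 9 12 3
print(min_max_value(n, edges, scenarios))   # 0
```

A tree that uses both the $x_1$ edge and the $\sim x_1$ edge costs $2(4m-1) - (4m-2) = 4m$ under the corresponding scenario, which is the value forced on every tree when the formula is unsatisfiable.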



2.2 Randomized algorithm for min-max spanning tree 

If the edge costs are nonnegative under all scenarios, then the Min-max Spanning Tree
problem is approximable within $K$, where $K$ is the number of scenarios, and this is the best
approximation ratio known so far [3]. On the other hand the problem is not at all approximable if
negative costs are allowed (Theorem 4). In this section, we assume that all costs are nonnegative
and we give a polynomial time approximation algorithm for the problem which returns an
$O(\log^2 n)$-approximate spanning tree, where $n$ is the number of vertices of $G$. The algorithm
is based on a randomized rounding of a solution to an iterative linear program.

It is easy to check that binary solutions to the following program $LP_{minmax}(C)$ are in
one-to-one correspondence with the solutions to Min-max Spanning Tree whose cost in every
scenario is at most $C$:

$$LP_{minmax}(C): \quad \sum_{e \in E} c_e^S x_e \le C \quad \forall S \in \Gamma, \qquad (4)$$

$$\sum_{e \in E} x_e = n - 1, \qquad (5)$$

$$\sum_{e \in \delta(W)} x_e \ge 1 \quad \forall\, \emptyset \ne W \subset V, \qquad (6)$$

$$0 \le x_e \le 1 \quad \forall e \in E, \qquad (7)$$

$$\text{if } c_e^S > C \text{ then } x_e = 0 \quad \forall e \in E \text{ and } \forall S \in \Gamma, \qquad (8)$$

where $\delta(W)$ denotes the cut determined by vertex set $W$, i.e. $\delta(W) = \{(i, j) \in E : i \in W,
j \in V \setminus W\}$. The core of $LP_{minmax}(C)$ (constraints (5)-(7)) is the relaxation of the cut-set
formulation for spanning tree [18]. The polynomial time solvability of $LP_{minmax}(C)$ follows
from an efficient polynomial time separation based on the min-cut problem (see [18]). Solving
$LP_{minmax}(C)$ consists in rejecting all edges $e \in E$ having $c_e^S > C$ under some scenario $S \in \Gamma$
and solving then the resulting linear programming problem. Using binary search in
$[0, (n - 1) c_{\max}]$, where $c_{\max} = \max_{e \in E} \max_{S \in \Gamma} c_e^S$, one can find the minimal value of parameter $C$,
for which there is a feasible solution to $LP_{minmax}(C)$. Let $\hat{C}$ be this minimal value and let
$(\hat{x}_e)_{e \in E}$ be a feasible solution to $LP_{minmax}(\hat{C})$. Clearly $\hat{C} \le OPT_1$. Furthermore, if $\hat{x}_e > 0$,
then $c_e^S \le \hat{C}$ and thus $c_e^S \le OPT_1$ for each scenario $S \in \Gamma$.

We now give an algorithm that randomly rounds a feasible solution of $LP_{minmax}(\hat{C})$ to an
$O(\log^2 n)$-approximate min-max spanning tree (see Algorithm 1).



Algorithm 1: Randomized algorithm for Min-max Spanning Tree
Use binary search in $[0, (n - 1) c_{\max}]$ to find the minimal value of $C$ such that there
exists a feasible solution to $LP_{minmax}(C)$, i.e., $\hat{C}$ and $(\hat{x}_e)_{e \in E}$.
Initially $F$ contains only vertices of $G$, that is $n$ components.
$r \leftarrow \lceil 2(11 + \sqrt{21}) \ln n \rceil$
for $k \leftarrow 1$ to $r$ do
    For all $e \in E$, add edge $e$ independently with probability $\hat{x}_e$ to $F$.
    if $F$ is connected then
        exit for-loop
if $F$ is connected then
    return a spanning tree of $F$
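
The rounding loop of Algorithm 1 can be sketched in a few lines. The sketch assumes that a fractional solution `x` of the linear program is already available (solving the LP and the binary search over $C$ are not shown), and the function name `round_to_spanning_tree` is an illustrative choice. Instead of storing every sampled edge, it keeps only edges that merge two components, which yields a spanning tree of the sampled graph $F$ whenever $F$ is connected.

```python
import math
import random

def round_to_spanning_tree(n, edges, x, rng=random):
    """Randomized rounding in the spirit of Algorithm 1.

    edges: list of (u, v) with vertices 0..n-1; x[i] is the LP value of edges[i].
    Returns edge indices of a spanning tree, or None if F stays disconnected.
    """
    r = math.ceil(2 * (11 + math.sqrt(21)) * math.log(n))
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree, components = [], n
    for _ in range(r):
        for i, (u, v) in enumerate(edges):
            if rng.random() < x[i]:        # sample edge i with probability x[i]
                ru, rv = find(u), find(v)
                if ru != rv:               # keep only edges that merge components
                    parent[ru] = rv
                    tree.append(i)
                    components -= 1
        if components == 1:
            break                          # F is connected, stop early
    return tree if components == 1 else None

path = [(0, 1), (1, 2), (2, 3)]
print(round_to_spanning_tree(4, path, [1, 1, 1]))  # [0, 1, 2]
print(round_to_spanning_tree(4, path, [1, 0, 1]))  # None
```

The two calls use degenerate probabilities (0 or 1) so that the outcome does not depend on the random generator; with genuinely fractional values the returned tree varies between runs.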



Let us analyze Algorithm 1. Obviously the algorithm is polynomial. The following lemma
shows that the total cost of edges included in each iteration under any scenario $S \in \Gamma$ is
$O(\ln n) OPT_1$ with probability at least $1 - \frac{1}{n}$:

Lemma 1. Let $E_k$ be a set of edges added to $F$ at iteration $k$ of Algorithm 1 and let $K \le n^{\rho_2}$,
$1 \le f \le n^{\rho_3}$, where $f$, $\rho_1$, $\rho_2$, $\rho_3$ are nonnegative constants such that $\rho_2 + \rho_3 \le 3.92 \cdot \rho_1$,
$\rho_1 \ge 2$. Then

$$\max_{S \in \Gamma} \sum_{e \in E_k} c_e^S \le (\rho_1 \ln n + 1.5) \left( 1 + 2 \sqrt{1 + \frac{\ln K + \ln f}{\rho_1 \ln n}} \right) OPT_1 \qquad (9)$$

holds with probability at least $1 - \frac{1}{f n^{\rho_1 - 1}}$.

Proof. See Appendix A. □

We now analyze the feasibility of an output solution $F$. Let $F_k$ be the forest obtained from
$F_{k-1}$ after the $k$-th iteration. Initially, $F_0$, $F_0 \subseteq G$, has no edges. Let $C_k$ denote the number
of connected components of $F_k$. Obviously, $C_0 = n$. We say that an iteration $k$ is "successful"
if either $C_{k-1} = 1$ ($F_{k-1}$ is connected) or $C_k < 0.9 C_{k-1}$; otherwise, it is "failure". We now
recall a result of Alon [4] (see also [9]). His proof is repeated in Appendix A for completeness.

Lemma 2 (Alon [4]). For every $k$, the conditional probability that iteration $k$ is "successful",
given any set of components in $F_{k-1}$, is at least 1/2.

From Lemma 2 it follows that the probability of the event that iteration $k$ is "successful"
is at least 1/2. This is a lower bound on the probability of success given any history. Note
that, if forest $F_k$ is not connected ($C_k > 1$) then the number of "successful" iterations has been
less than $\log_{10/9} n < 10 \ln n$. Let $X$ be a random variable denoting the number of "successful"
iterations among $r$ performed iterations of the algorithm. The probability $\Pr[X < 10 \ln n]$
can be upper bounded by $\Pr[Y < 10 \ln n]$, where $Y = \sum_{k=1}^{r} Y_k$ is the sum of $r$ independent
Bernoulli trials such that $\Pr[Y_k = 1] = 1/2$. This estimation can be done, since we have a
lower bound on the probability of success given any history. Clearly, $E[Y] = r/2$. We apply the
Chernoff bound (see for instance [20]) and determine the values of $\delta \in (0, 1]$ and $r$ in order to
fulfill the following inequality:

$$\Pr[X < 10 \ln n] \le \Pr[Y < 10 \ln n] = \Pr[Y < (1 - \delta) E[Y]] \le e^{-E[Y] \delta^2 / 2} = \frac{1}{n}. \qquad (10)$$

It is easily seen that inequality (10) holds if the following system of equations

$$(1 - \delta) r / 2 = 10 \ln n, \qquad r \delta^2 / 4 = \ln n \qquad (11)$$

holds true. An easy computation for $\delta$ and $r$ in (11) shows that $r = 2(11 + \sqrt{21}) \ln n$,
$\delta = \frac{\sqrt{21} - 1}{10}$. Hence, after $r$ iterations, $r = \lceil 2(11 + \sqrt{21}) \ln n \rceil$, we obtain with probability at
least $1 - 1/n$ a spanning tree. By the union bound and Lemma 1 (set $f = r$), with probability
at least $1 - 1/n$ in every iteration, $k = 1, \dots, r$, the set of edges $E_k$ included at iteration $k$
satisfies the bound (9). We conclude that after $r$ iterations, we get with probability at least
$1 - 2/n$ a spanning tree whose total cost in every scenario is $O(r \ln n) OPT_1$. We have thus
proved the following theorem:

Theorem 5. There is a polynomial time randomized algorithm for Min-max Spanning
Tree that returns with probability at least $1 - \frac{2}{n}$ a solution whose total cost in every scenario
is $O(\log^2 n) OPT_1$.
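
The constants used in the analysis above can be verified numerically. The short check below is only a sanity test of the arithmetic (the function name `rounds_and_delta` and the sample value $n = 1000$ are arbitrary): it confirms that $\delta = (\sqrt{21} - 1)/10$ and $r = 2(11 + \sqrt{21}) \ln n$ solve the two equations $(1 - \delta) r / 2 = 10 \ln n$ and $r \delta^2 / 4 = \ln n$, and that $\log_{10/9} n < 10 \ln n$.

```python
import math

def rounds_and_delta(n):
    """Closed-form r (before taking the ceiling) and delta for a given n."""
    delta = (math.sqrt(21) - 1) / 10      # positive root of 5*d^2 + d - 1 = 0
    r = 2 * (11 + math.sqrt(21)) * math.log(n)
    return r, delta

n = 1000
r, delta = rounds_and_delta(n)
print(math.isclose((1 - delta) * r / 2, 10 * math.log(n)))  # True
print(math.isclose(r * delta ** 2 / 4, math.log(n)))        # True
print(1 / math.log(10 / 9) < 10)                             # True
```

Eliminating $r$ from the two equations gives $5\delta^2 + \delta - 1 = 0$, whose positive root is the value of $\delta$ used here.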



3 2-stage spanning tree 

In this section, we discuss the 2-stage spanning tree problem in the robust optimization
setting. We show that the problem is hard to approximate within a ratio of $O(\log n)$ unless
the problems in NP have quasi-polynomial algorithms. Then, we give an LP-based randomized
approximation algorithm with ratio of $O(\log^2 n)$.



3.1 Hardness of approximation 

Theorem 6. The 2-Stage Spanning Tree problem is not approximable within any constant,
unless P=NP, and within $(1 - \epsilon) \ln n$ for any $\epsilon > 0$, unless NP $\subseteq$ DTIME($n^{\log \log n}$).



Proof. We proceed with a cost preserving reduction from Set Cover to 2-Stage Spanning
Tree. The reduction is similar to that in [12] for the 2-stage stochastic spanning tree. Set
Cover is defined as follows (see, e.g., [5], [13]):

Set Cover: Input: A ground set $\mathcal{U} = \{1, \dots, n\}$ and a collection of its subsets $U_1, \dots, U_m$
such that $\bigcup_{i=1}^{m} U_i = \mathcal{U}$.

A subcollection $I \subseteq \{1, \dots, m\}$ covers $\mathcal{U}$ if $\bigcup_{i \in I} U_i = \mathcal{U}$, where $|I|$ is the size of the
subcollection.

Output: A minimum sized subcollection that covers $\mathcal{U}$.

The Set Cover problem is not approximable within any constant, unless P=NP, and
within $(1 - \epsilon) \ln n$ for any $\epsilon > 0$, unless NP $\subseteq$ DTIME($n^{\log \log n}$), where $n$ is the size of the
ground set (see [SIHI]). For a given instance $\mathcal{C} = (\mathcal{U}, U_1, \dots, U_m)$ of Set Cover, we construct
an instance $\mathcal{T} = (G = (V, E), \Gamma)$ of 2-Stage Spanning Tree as follows. Graph $G = (V, E)$
is a complete graph with $m + n + 1$ vertices $V = \{u_1, \dots, u_m, 1, \dots, n, r\}$. Vertices $u_1, \dots, u_m$
correspond to $m$ subsets $U_1, \dots, U_m$, vertices $1, \dots, n$ correspond to $n$ elements of set $\mathcal{U}$. The
costs of the edges $(r, u_i)$, $i = 1, \dots, m$, in $G$ in the first stage are set to 1 and the costs of all
the remaining edges in $G$ are set to $m + 1$. Now we form scenario set $\Gamma$ in the second stage.
Each scenario $S_j \in \Gamma$ corresponds to vertex $j$, $j = 1, \dots, n$. Let $T_j = \{j\} \cup \{u_i : j \in U_i\}$
and let $(T_j, V \setminus T_j)$ be the cut separating $T_j$ from all other vertices of $G$. Each second stage
scenario $S_j$ is defined as: the costs of the edges from cut $(T_j, V \setminus T_j)$ are set to $m + 1$ and the
costs of the remaining edges in $G$ are set to 0.

We now prove that there is a subcollection of size at most $k \le m$ that covers $\mathcal{U}$ if and only
if there exists a spanning tree in $G$ of the maximum 2-stage cost at most $k \le m$. Suppose we
are given a subcollection $U_{i_1}, \dots, U_{i_k}$ of size $k$ that covers $\mathcal{U}$. In the first stage, we include in
$E_1$ the edges $(r, u_{i_j})$, where vertices $u_{i_j}$ correspond to subsets $U_{i_j}$, $j = 1, \dots, k$. The cost of $E_1$
is equal to $k$. In the second stage, we augment $E_1$ to form a spanning tree with edges of cost zero in
each scenario $S_j$, $j = 1, \dots, n$. Hence, the maximum 2-stage cost of the obtained spanning
tree equals $k$. Conversely, let $T$ be a spanning tree in $G$ with the maximum 2-stage cost at
most $k$. Hence, this tree does not contain any edge with cost $m + 1$. Consequently, in the
first stage the tree contains $k' \le k$ edges of the form $(r, u_{i_j})$, $j = 1, \dots, k'$, and in the second
stage in each scenario it contains zero cost edges. The vertices $u_{i_j}$ correspond to subsets $U_{i_j}$,
$j = 1, \dots, k'$. It is easily seen that any element $i \in \mathcal{U}$ must be covered by at least one of
subsets $U_{i_j}$, $j = 1, \dots, k'$. Otherwise the solution would contain an edge of cost $m + 1$. Thus,
$U_{i_j}$, $j = 1, \dots, k'$, form a subcollection of the size at most $k$ that covers $\mathcal{U}$.

The presented reduction is cost preserving. Hence, 2-Stage Spanning Tree has the
same approximation bounds as Set Cover. □
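
The correspondence between covers and cheap 2-stage solutions can be illustrated on a small instance. The sketch below does not build the complete graph explicitly; it relies on the observation from the proof that, under scenario $S_j$, the zero-cost edges stay on one side of the cut $(T_j, V \setminus T_j)$, so the first-stage edges can be completed for free exactly when one of them crosses that cut. The encoding of vertex $u_i$ as the pair `("u", i)` and the names `cut_side` and `two_stage_cost` are assumptions made for this example.

```python
def cut_side(j, subsets):
    """T_j = {j} together with every u_i such that j belongs to U_i."""
    return {j} | {("u", i) for i, U in enumerate(subsets) if j in U}

def two_stage_cost(n, subsets, first_stage):
    """Maximum 2-stage cost when edges (r, u_i), i in first_stage, are bought
    in the first stage and the tree is completed as cheaply as possible."""
    m = len(subsets)
    chosen = set(first_stage)
    k = len(chosen)                       # each first-stage edge costs 1
    worst = 0
    for j in range(1, n + 1):             # scenario S_j
        t_j = cut_side(j, subsets)
        crosses = any(("u", i) in t_j for i in chosen)
        # Without a crossing first-stage edge, one cut edge of cost m + 1
        # has to be bought in the second stage.
        second = 0 if crosses else m + 1
        worst = max(worst, k + second)
    return worst

subsets = [{1, 2}, {2, 3}, {3}]           # toy Set Cover instance, n = 3
print(two_stage_cost(3, subsets, [0, 1]))  # 2
print(two_stage_cost(3, subsets, [0]))     # 5
```

In the second call element 3 is not covered, so scenario $S_3$ forces an edge of cost $m + 1 = 4$ on top of the single first-stage edge.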

3.2 Randomized algorithm for 2-stage spanning tree 

In this section we construct a randomized approximation algorithm for 2-Stage Spanning
Tree, which is based on a similar idea as the corresponding algorithm for Min-max Spanning
Tree (see Section 2.2). Consider the following program $LP_{2stage}(C)$, whose binary
solutions correspond to the solutions of 2-Stage Spanning Tree:



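$$LP_{2stage}(C): \quad \sum_{e \in E} c_e x_e + \sum_{e \in E} c_e^S x_e^S \le C \quad \forall S \in \Gamma$$

$$\sum_{e \in E} (x_e + x_e^S) = n - 1 \quad \forall S \in \Gamma$$

$$\sum_{e \in \delta(W)} (x_e + x_e^S) \ge 1 \quad \forall\, \emptyset \ne W \subset V, \; \forall S \in \Gamma$$

$$0 \le x_e, x_e^S \le 1 \quad \forall e \in E, \; \forall S \in \Gamma$$

$$\text{if } c_e > C \text{ then } x_e = 0 \quad \forall e \in E$$

$$\text{if } c_e^S > C \text{ then } x_e^S = 0 \quad \forall e \in E, \; \forall S \in \Gamma$$

The algorithm (Algorithm 2) randomly rounds a feasible solution $\hat{x}_e$, $\hat{x}_e^S$, $S \in \Gamma$, $e \in E$, of
$LP_{2stage}(\hat{C})$, where $\hat{C}$ denotes the minimal value of $C$ for which there is a feasible solution to
$LP_{2stage}(C)$.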

Algorithm 2: Randomized algorithm for 2-Stage Minimum Spanning Tree
$c_{\max} \leftarrow \max_{e \in E} \{c_e, \max_{S \in \Gamma} c_e^S\}$
Use binary search in $[0, (n - 1) c_{\max}]$ to find the minimal value of $C$ such that there
exists a feasible solution of $LP_{2stage}(C)$, i.e., $\hat{C}$ and $\hat{x}_e$, $\hat{x}_e^S$, $S \in \Gamma$, $e \in E$.
Initially $F^S$ contains only vertices of $G$ for $S \in \Gamma$.
$r \leftarrow \lceil (\sqrt{\ln n + \ln K} + \sqrt{21 \ln n + \ln K})^2 \rceil$
for $k \leftarrow 1$ to $r$ do
    In the first stage: for all $e \in E$, choose edge $e$ independently with probability $\hat{x}_e$
    and add it to each $F^S$ for $S \in \Gamma$.
    In the second stage: for every $S \in \Gamma$ and every $e \in E$, add edge $e$ independently
    with probability $\hat{x}_e^S$ to $F^S$.
if all $F^S$, $S \in \Gamma$, are connected then
    return $\{F^S\}_{S \in \Gamma}$
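
A compact sketch of the rounding phase of Algorithm 2 is given below. It assumes that the fractional values `x_first[i]` and `x_second[s][i]` have already been obtained from the linear program (not shown), and the helper class `Components` and the function name `two_stage_rounding` are illustrative. It returns the sampled first-stage edge set together with one second-stage edge set per scenario, or `None` when some scenario graph is still disconnected after $r$ rounds.

```python
import math
import random

class Components:
    """Union-find that tracks the number of connected components."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.count = n

    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]
            a = self.parent[a]
        return a

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            self.count -= 1

def two_stage_rounding(n, edges, x_first, x_second, rng=random):
    K = len(x_second)
    r = math.ceil((math.sqrt(math.log(n) + math.log(K))
                   + math.sqrt(21 * math.log(n) + math.log(K))) ** 2)
    graphs = [Components(n) for _ in range(K)]
    first, second = set(), [set() for _ in range(K)]
    for _ in range(r):
        for i, (u, v) in enumerate(edges):        # first stage, shared by all S
            if rng.random() < x_first[i]:
                first.add(i)
                for g in graphs:
                    g.union(u, v)
        for s in range(K):                        # second stage, per scenario
            for i, (u, v) in enumerate(edges):
                if rng.random() < x_second[s][i]:
                    second[s].add(i)
                    graphs[s].union(u, v)
    if all(g.count == 1 for g in graphs):
        return first, second
    return None

path = [(0, 1), (1, 2)]
print(two_stage_rounding(3, path, [1, 0], [[0, 1], [0, 1]]))  # ({0}, [{1}, {1}])
print(two_stage_rounding(3, path, [1, 0], [[0, 1], [0, 0]]))  # None
```

As in the single-stage case, the example uses probabilities 0 and 1 only so that the result is deterministic.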



An analysis of Algorithm 2 proceeds similarly as the one of Algorithm 1. The following
lemma holds (the proof goes in similar manner as the proof of Lemma 1):

Lemma 3. Let $E_k$ and $E_k^S$ be the sets of edges in the first stage and in the second stage
for every $S \in \Gamma$, respectively, added to $F^S$ at iteration $k$ of Algorithm 2 and let $K \le n^{\rho_2}$,
$1 \le f \le n^{\rho_3}$, where $f$, $\rho_1$, $\rho_2$, $\rho_3$ are nonnegative constants such that $\rho_2 + \rho_3 \le 3.92 \cdot \rho_1$,
$\rho_1 \ge 2$. Then

$$\sum_{e \in E_k} c_e + \sum_{e \in E_k^S} c_e^S \le (\rho_1 \ln n + 1.5) \left( 1 + 2 \sqrt{1 + \frac{\ln K + \ln f}{\rho_1 \ln n}} \right) OPT_3 \quad \forall S \in \Gamma \qquad (12)$$

holds with probability at least $1 - \frac{1}{f n^{\rho_1 - 1}}$.

Let $F_k^S$ be the forest for $S \in \Gamma$ after the $k$-th iteration of Algorithm 2. Let $C_k^S$ denote the
number of connected components of $F_k^S$. Again, we say that an iteration $k$ is "successful" if
either $C_{k-1}^S = 1$ or $C_k^S < 0.9 C_{k-1}^S$; otherwise it is "failure". The probability of the event that
iteration $k$ is "successful" is at least 1/2, which is due to Lemma 2.

Consider any scenario $S \in \Gamma$. If forest $F_k^S$ is not connected then the number of "successful"
iterations is less than $\log_{10/9} n < 10 \ln n$. We estimate $\Pr[X < 10 \ln n]$ by $\Pr[Y < 10 \ln n]$, where
$X$ is a random variable denoting the number of "successful" iterations among $r$ iterations and
$Y = \sum_{k=1}^{r} Y_k$ is the sum of $r$ independent Bernoulli trials such that $\Pr[Y_k = 1] = 1/2$,
$E[Y] = r/2$. We use the Chernoff bound and compute the values of $\delta \in (0, 1]$ and $r$ satisfying
the following inequality:

$$\Pr[X < 10 \ln n] \le \Pr[Y < 10 \ln n] = \Pr[Y < (1 - \delta) E[Y]] \le e^{-E[Y] \delta^2 / 2} = \frac{1}{nK}. \qquad (13)$$

This gives $r = (\sqrt{\ln n + \ln K} + \sqrt{21 \ln n + \ln K})^2$ and $\delta = \frac{2 \sqrt{\ln n + \ln K}}{\sqrt{r}}$. Recall that
$K$ is the number of scenarios. By the union bound, the probability that a forest in at least
one scenario $S$ is not connected is less than $1/n$. Again, by the union bound and Lemma 3
(set $f = r$), with probability at least $1 - 1/n$ in every iteration $k$, $k = 1, \dots, r$, the sets of
edges $E_k$ and $E_k^S$ for each $S \in \Gamma$, included at iteration $k$, satisfy the bound (12). Thus, after
$r$ iterations, $r = \lceil (\sqrt{\ln n + \ln K} + \sqrt{21 \ln n + \ln K})^2 \rceil$, with probability at least $1 - 2/n$, we
obtain spanning trees of cost $O(r \ln n) OPT_3$ in every scenario. We get the following theorem:

Theorem 7. There is a polynomial time randomized algorithm for 2-Stage Minimum Spanning
Tree that returns with probability at least $1 - \frac{2}{n}$ a spanning tree whose cost in every
scenario is $O(\log^2 n) OPT_3$.

A Some proofs

Proof. (Lemma 1) In order to prove the bound (9), we will apply a technique used in [16], [15].
Consider any scenario $S \in \Gamma$. Let us sort the costs in $S$ in nonincreasing order
$c^S_{e[1]} \ge c^S_{e[2]} \ge \dots \ge c^S_{e[m]}$ ($m$ is the number of edges of $G$). We partition the ordered set of edges $E$ into
groups as follows. The first group $G^{(1)}$ consists of edges $e[1], \dots, e[j^{(1)}]$, where $j^{(1)}$ is the
maximum such that $\hat{x}_{e[1]} + \dots + \hat{x}_{e[j^{(1)}]} \le \rho_1 \ln n$. The subsequent groups $G^{(l)}$, $l = 2, \dots, t$,
are defined in the same way, that is $G^{(l)}$ consists of edges $e[j^{(l-1)} + 1], \dots, e[j^{(l)}]$, where $j^{(l)}$ is
the maximum such that $\hat{x}_{e[j^{(l-1)}+1]} + \dots + \hat{x}_{e[j^{(l)}]} \le \rho_1 \ln n$. The optimal value $OPT_1$ satisfies:

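$$OPT_1 \ge \hat{C} \ge \sum_{i=1}^{m} c^S_{e[i]} \hat{x}_{e[i]} \ge \sum_{l=1}^{t} \Big( \min_{e \in G^{(l)}} c_e^S \Big) \sum_{e \in G^{(l)}} \hat{x}_e \ge (\rho_1 \ln n - 1) \sum_{l=1}^{t-1} \min_{e \in G^{(l)}} c_e^S. \qquad (14)$$

Let $X_e$ be a binary random variable with $\Pr[X_e = 1] = \hat{x}_e$. It holds

$$\sum_{e \in E_k} c_e^S = \sum_{l=1}^{t} \sum_{e \in G^{(l)}} c_e^S X_e \le \sum_{l=1}^{t} \Big( \max_{e \in G^{(l)}} c_e^S \Big) \sum_{e \in G^{(l)}} X_e \le \Big( \max_{e \in G^{(1)}} c_e^S \Big) \sum_{e \in G^{(1)}} X_e + \sum_{l=2}^{t} \Big( \min_{e \in G^{(l-1)}} c_e^S \Big) \sum_{e \in G^{(l)}} X_e. \qquad (15)$$
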

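Let us recall a Chernoff bound (see e.g., [20]). Suppose $X_1, \dots, X_N$ are independent Poisson
trials such that $\Pr[X_i = 1] = p_i$. Let $X = \sum_{i=1}^{N} X_i$. Then the inequality holds:
$\Pr[X > E[X](1 + \delta)] \le e^{-E[X] \delta^2 / 4}$ for any $\delta \le 2e - 1$. We use this Chernoff bound to estimate
$\sum_{e \in G^{(l)}} X_e$ in each group $G^{(l)}$. Consider a group $G^{(l)}$. It holds
$E[\sum_{e \in G^{(l)}} X_e] = \sum_{e \in G^{(l)}} \hat{x}_e \le \rho_1 \ln n$. Set
$\delta = 2 \sqrt{(\rho_1 \ln n + \ln K + \ln f) / (\rho_1 \ln n)}$. Since $K \le n^{\rho_2}$, $1 \le f \le n^{\rho_3}$ and $\rho_2 + \rho_3 \le 3.92 \cdot \rho_1$,
$\rho_1 \ge 2$, inequality $\delta \le 2e - 1$ holds. Thus the Chernoff bound yields:

$$\Pr\Big[ \sum_{e \in G^{(l)}} X_e > \rho_1 \ln n (1 + \delta) \Big] \le e^{-(\rho_1 \ln n + \ln K + \ln f)} = \frac{1}{f K n^{\rho_1}}. \qquad (16)$$
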



By the union bound, the probability that $\sum_{e \in G^{(l)}} X_e > \rho_1 \ln n (1 + \delta)$ holds for at least one
group $G^{(l)}$ is less than $1/(f K n^{\rho_1 - 1})$ (because the number of groups is at most $n$). Now
applying the bound $\sum_{e \in G^{(l)}} X_e \le \rho_1 \ln n (1 + \delta)$ for every $l = 1, \dots, t$ to (15) and using the
fact that $\max_{e \in G^{(1)}} c_e^S \le OPT_1$ and inequality (14) we obtain:

$$\sum_{e \in E_k} c_e^S \le \rho_1 \ln n \left( 1 + 2 \sqrt{\frac{\rho_1 \ln n + \ln K + \ln f}{\rho_1 \ln n}} \right) \left( OPT_1 + \frac{OPT_1}{\rho_1 \ln n - 1} \right).$$

An easy computation shows that:

$$\sum_{e \in E_k} c_e^S \le (\rho_1 \ln n + 1.5) \left( 1 + 2 \sqrt{1 + \frac{\ln K + \ln f}{\rho_1 \ln n}} \right) OPT_1.$$

The probability that the bound fails for a given scenario $S$ is less than $1/(f K n^{\rho_1 - 1})$ so,
by the union bound, the probability that it fails for at least one scenario $S \in \Gamma$ is less
than $1/(f n^{\rho_1 - 1})$. □

Proof. (Lemma 2) If $F_{k-1}$ is connected then we are done. Otherwise, let us denote by
$H = (V_H, E_H)$ the graph obtained from $F_{k-1}$ by contracting every connected component to a
single vertex. An edge $e$ is not included in $F_k$ with probability $1 - \hat{x}_e$. Hence, the probability
that any vertex $v$ of $H$ remains isolated is

$$\prod_{e \in \delta(v)} (1 - \hat{x}_e) \le \exp\Big( -\sum_{e \in \delta(v)} \hat{x}_e \Big) \le 1/e,$$

where $\delta(v)$ denotes the set of edges incident to $v$. The last inequality follows from the fact
that $\sum_{e \in \delta(v)} \hat{x}_e \ge 1$. By linearity of expectation, the expected number of isolated vertices
of $H$ is at most $|V_H|/e$, and thus with the probability at least 1/2 the number of isolated vertices is
at most $2|V_H|/e$. Hence, the number of connected components of $F_k$ is at most

$$\frac{2|V_H|}{e} + \frac{1}{2} \left( |V_H| - \frac{2|V_H|}{e} \right) = \left( \frac{1}{2} + \frac{1}{e} \right) |V_H| < 0.9 |V_H|.$$

Since $|V_H| = C_{k-1}$, the lemma follows.

□






